chore(docs): sitemap #38

Status: Merged. coryrylan merged 1 commit into main from topic-docs-sitemap-fix on Apr 29, 2026.

Conversation

@coryrylan (Collaborator) commented Apr 29, 2026

  • fix broken sitemap
  • add additional content schema/metadata
  • generate page titles
  • fix broken robots.txt

Summary by CodeRabbit

  • New Features

    • Automatic sitemap.xml generation during site builds.
    • Richer page metadata and SEO: canonical URLs, enhanced Open Graph tags, and structured JSON‑LD.
  • Chores

    • Updated robots.txt policy to explicit crawler rules and sitemap reference; removed the old robots file.
    • Externalized site URL configuration for build runs.
    • Minor content/frontmatter and page-context tweaks for examples and the homepage description.

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Centralizes page metadata and JSON‑LD, derives BASE_URL from the new metadata module, adds a build‑only sitemap generator, removes passthrough copy of src/robots.txt, introduces ELEMENTS_SITE_URL, and replaces the deleted src/robots.txt with an expanded public/robots.txt.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Config & Build | projects/site/eleventy.config.js, projects/site/package.json | Eleventy config now imports BASE_URL from the metadata module, stops copying src/robots.txt, and conditionally registers the sitemap plugin only for build runs. wireit.build exposes ELEMENTS_SITE_URL env var with a default. |
| Metadata & Utilities | projects/site/src/_11ty/layouts/metadata.js, projects/site/src/_11ty/utils/env.js | New centralized metadata module exporting BASE_URL, escapeAttr, resolvePageMeta, and renderJsonLd. Added ELEMENTS_SITE_URL env constant with default https://nvidia.github.io. |
| Sitemap Plugin | projects/site/src/_11ty/plugins/sitemap-xml.js | New Eleventy plugin that collects render URLs, filters publishable paths, deduplicates and sorts them, and writes ./.11ty-vite/public/sitemap.xml with a unified <lastmod> timestamp. |
| Layouts & Templates | projects/site/src/_11ty/layouts/common.js, projects/site/src/docs/elements/_tabs/examples.11ty.js | renderBaseHead now uses resolvePageMeta, sanitizes attributes, adds canonical/OG tags and injected JSON‑LD. Examples tab pages set data.isExamplesTab = true. Local BASE_URL removed from common layout. |
| Robots & Content | projects/site/src/robots.txt (deleted), projects/site/public/robots.txt, projects/site/src/index.md | Removed src/robots.txt passthrough; added an expanded public/robots.txt with crawler-specific rules and Sitemap directive. Homepage frontmatter adds a description. |
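
The Sitemap Plugin row describes a filter, dedupe, and sort pass over the rendered URLs. A minimal sketch of that step follows, under stated assumptions: `isPublishable` and `collectSitemapUrls` are hypothetical names, and the exclusion rules are borrowed from the robots.txt Disallow paths mentioned later in the review, since the actual rules in sitemap-xml.js are not shown in this thread.

```javascript
// Hypothetical helper: the real filter rules in sitemap-xml.js are not shown in this PR thread.
function isPublishable(url) {
  // Pages without an output URL are represented here as non-strings (for example `false`).
  if (typeof url !== 'string' || !url.startsWith('/')) return false;
  // Assumed exclusions, borrowed from the Disallow rules discussed in the review.
  if (url.startsWith('/assets/') || url.startsWith('/.pagefind/')) return false;
  // Keep directory-style URLs and explicit .html pages only.
  return url.endsWith('/') || url.endsWith('.html');
}

function collectSitemapUrls(urls) {
  // A Set removes duplicates; sorting keeps the output stable between builds.
  return [...new Set(urls.filter(isPublishable))].sort();
}

console.log(collectSitemapUrls(['/docs/', '/', '/docs/', '/assets/logo.svg', false]));
```

Sorting the deduplicated list is what makes the generated file diff cleanly from one build to the next.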

Sequence Diagram(s)

sequenceDiagram
    participant Build as Build Process
    participant Config as Eleventy Config
    participant Renderer as Page Renderer
    participant Metadata as Metadata Module
    participant JsonLD as JSON-LD Builder
    participant Collector as URL Collector
    participant Sitemap as Sitemap Plugin
    participant FS as Filesystem Output

    Build->>Config: start (ELEVENTY_RUN_MODE=build)
    Config->>Renderer: render pages
    Renderer->>Metadata: resolvePageMeta(data)
    Metadata-->>Renderer: meta {title, description, url, canonicalUrl, ogImage}
    Renderer->>JsonLD: renderJsonLd(data, meta)
    JsonLD-->>Renderer: JSON-LD script
    Renderer->>Collector: register page URLs
    Config->>Sitemap: eleventy.after hook (build only)
    Sitemap->>Collector: retrieve URLs
    Sitemap->>Sitemap: filter, dedupe, sort
    Sitemap->>FS: write ./.11ty-vite/public/sitemap.xml
    FS-->>Build: sitemap written
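
The build-only `eleventy.after` step in the diagram could be wired roughly as follows. This is a sketch, not the PR's code: it assumes Eleventy 2.0 or later (where the `eleventy.after` event receives a `results` array), and `sitemapPlugin`, `registerForBuild`, and the `write` option are hypothetical names introduced for illustration.

```javascript
// Hypothetical plugin shape; the actual sitemap-xml.js is not shown in this thread.
function sitemapPlugin(eleventyConfig, options = {}) {
  const write = options.write ?? (() => {});
  // `eleventy.after` fires once the build has finished writing templates.
  eleventyConfig.on('eleventy.after', async ({ results }) => {
    // Drop entries without an output URL before handing them to the writer.
    const urls = results.map(result => result.url).filter(Boolean);
    await write(urls);
  });
}

// Register only for full builds, so serve/watch runs skip sitemap generation.
function registerForBuild(eleventyConfig, runMode, options) {
  if (runMode === 'build') {
    eleventyConfig.addPlugin(sitemapPlugin, options);
  }
}
```

In an Eleventy config this would be called as `registerForBuild(eleventyConfig, process.env.ELEVENTY_RUN_MODE, { write })`, where `write` is whatever function serializes the URLs to disk.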

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~23 minutes

Poem

🐇 I hopped through metadata tonight,

Canonicals shining, JSON‑LD bright,
Sitemaps sewn from trailing hops,
Robots new rules at the stops,
The site now leaps — a joyous sight! 🥕✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'chore(docs): sitemap' is too generic and does not clearly summarize the comprehensive changes made. While sitemap is addressed, the PR also fixes robots.txt, adds schema/metadata, and generates page titles, making the title incomplete. | Expand the title to better reflect the main changes, e.g., 'chore(docs): add sitemap generation, schema metadata, and fix robots.txt' or 'chore(docs): sitemap generation with metadata and robots.txt fixes'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/site/public/robots.txt`:
- Line 76: The robots.txt currently hardcodes the sitemap URL which can diverge
from configured origins; update robots.txt to derive the sitemap URL from the
existing site config by using ELEMENTS_SITE_URL and BASE_URL (or perform a
build-time/template substitution) so the Sitemap line points at
`${ELEMENTS_SITE_URL}${BASE_URL}sitemap.xml` (or explicitly document/keep the
hardcoded production value if intentional); change generation/location so the
robots.txt content is templated from the app config rather than embedding the
literal https://nvidia.github.io/elements/sitemap.xml.
- Line 50: The commented-out directive "# Disallow: /" under the ClaudeBot entry
is confusing; either remove this commented line entirely or replace it with a
clear comment explaining why it is commented (e.g., "intentionally allowed" or
"previously disallowed") or make it an active directive if you intend to block
crawling; locate the ClaudeBot section in projects/site/public/robots.txt and
update the "# Disallow: /" line accordingly.

In `@projects/site/src/_11ty/layouts/metadata.js`:
- Around line 38-45: Add JSDoc blocks for escapeAttr, resolvePageMeta, and
renderJsonLd that state expected parameter types, allowed/required fields,
return types, and any constraints (e.g., null/undefined handling, string
encoding expectations); within resolvePageMeta and renderJsonLd also document
the main decision branches (e.g., when page.frontmatter takes precedence, when
defaults are applied, and when JSON-LD is omitted) and the conditions that
trigger each branch; include notes about side effects (none) and error behavior
(throws vs. safe fallback) and add brief structured-log call points or log
message templates to indicate where decision outcomes are emitted for
observability (reference the exact function names escapeAttr, resolvePageMeta,
renderJsonLd and the key branch conditions in those functions).
- Line 1: The current import uses Node's platform-aware join (import { join }
from 'node:path') which yields backslashes on Windows and breaks
URL/breadcrumbs; change to use POSIX join so URLs always use forward
slashes—update the import so the module uses path.posix.join (or import join
from 'node:path/posix') and replace uses of join in this file (metadata.js) so
canonical and breadcrumb URL construction calls the POSIX join variant instead
of the platform join.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4c1e11a8-65c4-4195-9c93-53ad5ab5b591

📥 Commits

Reviewing files that changed from the base of the PR and between 99737bf and 8743076.

📒 Files selected for processing (10)
  • projects/site/eleventy.config.js
  • projects/site/package.json
  • projects/site/public/robots.txt
  • projects/site/src/_11ty/layouts/common.js
  • projects/site/src/_11ty/layouts/metadata.js
  • projects/site/src/_11ty/plugins/sitemap-xml.js
  • projects/site/src/_11ty/utils/env.js
  • projects/site/src/docs/elements/_tabs/examples.11ty.js
  • projects/site/src/index.md
  • projects/site/src/robots.txt
💤 Files with no reviewable changes (1)
  • projects/site/src/robots.txt

Comment thread projects/site/public/robots.txt Outdated
Comment thread projects/site/public/robots.txt Outdated
Comment thread projects/site/src/_11ty/layouts/metadata.js
Comment on lines +38 to +45
export function escapeAttr(value) {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

🛠️ Refactor suggestion | 🟠 Major

Add JSDoc contracts for metadata helpers and decision branches.

Please document expected inputs/outputs and constraints for escapeAttr, resolvePageMeta, and renderJsonLd (and key decision branches) so build-time behavior is explicit and auditable.

As per coding guidelines, "Document agent capabilities, constraints, and expected inputs/outputs in code comments or docstrings" and "Use structured logging to track agent decision-making processes and state changes for debugging and monitoring".

Also applies to: 60-83, 89-133

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 38 - 45, Add JSDoc
blocks for escapeAttr, resolvePageMeta, and renderJsonLd that state expected
parameter types, allowed/required fields, return types, and any constraints
(e.g., null/undefined handling, string encoding expectations); within
resolvePageMeta and renderJsonLd also document the main decision branches (e.g.,
when page.frontmatter takes precedence, when defaults are applied, and when
JSON-LD is omitted) and the conditions that trigger each branch; include notes
about side effects (none) and error behavior (throws vs. safe fallback) and add
brief structured-log call points or log message templates to indicate where
decision outcomes are emitted for observability (reference the exact function
names escapeAttr, resolvePageMeta, renderJsonLd and the key branch conditions in
those functions).
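
As an illustration of the requested contract, a JSDoc block for `escapeAttr` might look like the sketch below. The function body is the one quoted in this thread; the wording of the doc comment is an assumption, and `export` is omitted so the snippet runs standalone.

```javascript
/**
 * Escapes a value for safe use inside a quoted HTML attribute.
 *
 * Side effects: none. Never throws; non-string input is stringified.
 *
 * @param {unknown} value - Any value; `null` and `undefined` become an empty string.
 * @returns {string} The stringified value with `&`, `<`, `>`, `"`, and `'` replaced by entities.
 */
function escapeAttr(value) {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeAttr('Tom & "Jerry"'));
```

The ampersand replacement has to run first; otherwise the entities produced by the later replacements would be escaped a second time.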

@coryrylan coryrylan force-pushed the topic-docs-sitemap-fix branch from 8743076 to 1c981c9 Compare April 29, 2026 15:39
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
projects/site/src/_11ty/layouts/metadata.js (1)

1-5: ⚠️ Potential issue | 🟠 Major

Use POSIX path joining for URL construction (currently platform-dependent).

Line 1 and Line 5 still use node:path.join for URL paths; on Windows this can generate \ and break canonical/breadcrumb URLs.

Suggested fix
-import { join } from 'node:path';
+import { posix as pathPosix } from 'node:path';
@@
-export const BASE_URL = join('/', process.env.PAGES_BASE_URL ?? '', '/');
+const rawBaseUrl = (process.env.PAGES_BASE_URL ?? '').replace(/\\/g, '/');
+export const BASE_URL = pathPosix.join('/', rawBaseUrl, '/');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 1 - 5, The code
builds BASE_URL using the platform-dependent path.join import (join) which can
produce backslashes on Windows; change the implementation to use POSIX joins for
URLs by replacing the import/use of join from 'node:path' with the POSIX variant
(e.g., path.posix.join or equivalent) or build the URL via URL/string templates
so BASE_URL (export const BASE_URL) always uses forward-slashes; update the
import and the BASE_URL expression accordingly.
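
The platform difference behind this finding can be reproduced on any operating system, because Node exposes both implementations as `path.win32` and `path.posix`. The sketch below uses `'elements'` as a stand-in for the `PAGES_BASE_URL` value (an assumption for illustration) and CommonJS `require` so it runs standalone; the project itself uses ES module imports.

```javascript
// CommonJS so the snippet runs standalone; the project itself uses ES module imports.
const path = require('node:path');

// path.win32 and path.posix are available on every platform, so the difference
// can be observed without a Windows machine.
const windowsStyle = path.win32.join('/', 'elements', '/');
const posixStyle = path.posix.join('/', 'elements', '/');

console.log(windowsStyle); // \elements\
console.log(posixStyle);   // /elements/

// An empty base segment collapses to the root path.
console.log(path.posix.join('/', '', '/')); // /
```

The default `path.join` behaves like `path.win32.join` on Windows and like `path.posix.join` elsewhere, which is why the issue only appears on Windows build machines.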
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/site/package.json`:
- Around line 67-70: Add documentation for the new build-time Wireit environment
variable ELEMENTS_SITE_URL to the BUILD.md docs: describe its purpose
(overriding the site URL during builds), list its default value
("https://nvidia.github.io"), indicate that it is a build-time/Wireit
environment variable, and show an example of how to set it for local CI/CD usage
and in the Wireit/CI configuration; update the BUILD.md section that covers
build env vars so ELEMENTS_SITE_URL appears alongside the other documented
variables.

In `@projects/site/public/robots.txt`:
- Around line 4-73: The explicit User-agent blocks (e.g., Googlebot, Bingbot,
OAI-SearchBot, ChatGPT-User, Claude-SearchBot, GPTBot, CCBot,
Meta-ExternalAgent, Applebot-Extended, etc.) currently only contain Allow
directives so they do not inherit the Disallow rules declared under "User-agent:
*", which leaves /assets/ and /.pagefind/ crawlable for those bots; fix this by
either adding the Disallow: /assets/ and Disallow: /.pagefind/ lines to each
explicit User-agent block (e.g., after the Allow: / lines for Googlebot,
Bingbot, OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User,
PerplexityBot, Perplexity-User, Amazonbot, YouBot, GPTBot, ClaudeBot,
Google-Extended, CCBot, Meta-ExternalAgent, FacebookBot, Applebot-Extended) or,
if the intent is identical policy for all bots, remove the redundant per-bot
blocks and keep a single User-agent: * block with Allow/Disallow rules to
enforce the intended restrictions consistently.
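
One way to keep per-bot groups consistent is to generate them from a single shared rule list. The sketch below assumes a hypothetical `renderRobotsGroups` helper; the robots.txt in this PR is a static file, so this only illustrates the first option described above.

```javascript
// Hypothetical generator; the real robots.txt in this PR is a static file.
const SHARED_RULES = ['Allow: /', 'Disallow: /assets/', 'Disallow: /.pagefind/'];

function renderRobotsGroups(userAgents, rules = SHARED_RULES) {
  // Each group repeats the full rule list, because a crawler that matches a
  // named group does not fall back to the rules under "User-agent: *".
  return userAgents
    .map(agent => [`User-agent: ${agent}`, ...rules].join('\n'))
    .join('\n\n');
}

console.log(renderRobotsGroups(['*', 'Googlebot', 'GPTBot']));
```

If every bot ends up with identical rules, the generator is unnecessary and a single `User-agent: *` group expresses the same policy.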

In `@projects/site/src/_11ty/layouts/common.js`:
- Around line 21-25: The template injects raw meta.canonicalUrl and meta.ogImage
into attributes; escape them before rendering to avoid quote/injection
issues—use the existing escapeAttr helper (same as used for meta.title and
meta.description) to replace the raw interpolations for href and og:url/content
(i.e., wrap meta.canonicalUrl and meta.ogImage with escapeAttr in the common.js
layout where those symbols are used).

In `@projects/site/src/_11ty/layouts/metadata.js`:
- Around line 92-104: The current fallback new Date() makes the JSON-LD dates
change every build; change the date assignment so it only uses a
source-controlled date (e.g., data.page.date when it exists and is a Date or a
parsable string) and do not default to new Date(); update the article object to
conditionally include datePublished and dateModified only when that
deterministic date is available (modify the const date and the article literal
where datePublished/dateModified are set), ensuring you neither inject a
build-time "now" nor leave invalid values in the JSON-LD.

---

Duplicate comments:
In `@projects/site/src/_11ty/layouts/metadata.js`:
- Around line 1-5: The code builds BASE_URL using the platform-dependent
path.join import (join) which can produce backslashes on Windows; change the
implementation to use POSIX joins for URLs by replacing the import/use of join
from 'node:path' with the POSIX variant (e.g., path.posix.join or equivalent) or
build the URL via URL/string templates so BASE_URL (export const BASE_URL)
always uses forward-slashes; update the import and the BASE_URL expression
accordingly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: decd9d3f-6feb-4e56-ae02-02d9b043ad66

📥 Commits

Reviewing files that changed from the base of the PR and between 8743076 and 1c981c9.

📒 Files selected for processing (10)
  • projects/site/eleventy.config.js
  • projects/site/package.json
  • projects/site/public/robots.txt
  • projects/site/src/_11ty/layouts/common.js
  • projects/site/src/_11ty/layouts/metadata.js
  • projects/site/src/_11ty/plugins/sitemap-xml.js
  • projects/site/src/_11ty/utils/env.js
  • projects/site/src/docs/elements/_tabs/examples.11ty.js
  • projects/site/src/index.md
  • projects/site/src/robots.txt
💤 Files with no reviewable changes (1)
  • projects/site/src/robots.txt

Comment on lines +67 to 70
"ELEMENTS_SITE_URL": {
  "external": true,
  "default": "https://nvidia.github.io"
}

🧹 Nitpick | 🔵 Trivial

Document ELEMENTS_SITE_URL in build docs.

ELEMENTS_SITE_URL was added to build-time Wireit env; please ensure /projects/internals/BUILD.md documents this variable and default value so CI/CD and local build docs stay aligned.

Based on learnings: "Review /projects/internals/BUILD.md when modifying build configuration, Wireit scripts, or CI/CD pipeline".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/package.json` around lines 67 - 70, Add documentation for the
new build-time Wireit environment variable ELEMENTS_SITE_URL to the BUILD.md
docs: describe its purpose (overriding the site URL during builds), list its
default value ("https://nvidia.github.io"), indicate that it is a
build-time/Wireit environment variable, and show an example of how to set it for
local CI/CD usage and in the Wireit/CI configuration; update the BUILD.md
section that covers build env vars so ELEMENTS_SITE_URL appears alongside the
other documented variables.
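
For the BUILD.md entry, it may help to show how the variable resolves at build time. The sketch below is an assumption about the shape of that logic: `resolveSiteUrl` is a hypothetical helper (env.js in the PR exposes a constant whose exact code is not shown here), and only the default value is taken from the thread.

```javascript
const DEFAULT_SITE_URL = 'https://nvidia.github.io';

// Hypothetical helper; env.js in the PR exposes a constant, its exact code is not shown here.
function resolveSiteUrl(env = process.env) {
  const value = (env.ELEMENTS_SITE_URL ?? '').trim();
  // Fall back to the documented default and drop trailing slashes so paths can be appended safely.
  return (value || DEFAULT_SITE_URL).replace(/\/+$/, '');
}

console.log(resolveSiteUrl({}));
console.log(resolveSiteUrl({ ELEMENTS_SITE_URL: 'https://example.com/' }));
```

Taking the environment as a parameter keeps the helper testable without mutating `process.env`.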

Comment thread projects/site/public/robots.txt Outdated
Comment on lines +21 to +25
<link rel="canonical" href="${meta.canonicalUrl}">
<meta property="og:title" content="${escapeAttr(meta.title)}">
<meta property="og:url" content="${meta.canonicalUrl}">
<meta property="og:description" content="${escapeAttr(meta.description)}">
<meta property="og:image" content="${meta.ogImage}">

⚠️ Potential issue | 🟠 Major

Escape canonicalUrl and ogImage before injecting into HTML attributes.

Line 21, Line 23, and Line 25 interpolate raw values into href/content. If these strings ever contain quotes, this can break the head markup and create an injection vector.

Proposed hardening
-  <link rel="canonical" href="${meta.canonicalUrl}">
+  <link rel="canonical" href="${escapeAttr(meta.canonicalUrl)}">
   <meta property="og:title" content="${escapeAttr(meta.title)}">
-  <meta property="og:url" content="${meta.canonicalUrl}">
+  <meta property="og:url" content="${escapeAttr(meta.canonicalUrl)}">
   <meta property="og:description" content="${escapeAttr(meta.description)}">
-  <meta property="og:image" content="${meta.ogImage}">
+  <meta property="og:image" content="${escapeAttr(meta.ogImage)}">
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/common.js` around lines 21 - 25, The template
injects raw meta.canonicalUrl and meta.ogImage into attributes; escape them
before rendering to avoid quote/injection issues—use the existing escapeAttr
helper (same as used for meta.title and meta.description) to replace the raw
interpolations for href and og:url/content (i.e., wrap meta.canonicalUrl and
meta.ogImage with escapeAttr in the common.js layout where those symbols are
used).
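
The hardening can be sketched as a small standalone renderer. `renderHeadMeta` is a hypothetical, reduced stand-in for the layout's head function, limited to the five tags quoted in this comment; `escapeAttr` is copied from the thread.

```javascript
function escapeAttr(value) {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical, reduced version of the head renderer; only the tags from the review are shown.
function renderHeadMeta(meta) {
  return [
    `<link rel="canonical" href="${escapeAttr(meta.canonicalUrl)}">`,
    `<meta property="og:title" content="${escapeAttr(meta.title)}">`,
    `<meta property="og:url" content="${escapeAttr(meta.canonicalUrl)}">`,
    `<meta property="og:description" content="${escapeAttr(meta.description)}">`,
    `<meta property="og:image" content="${escapeAttr(meta.ogImage)}">`,
  ].join('\n');
}
```

With every interpolation escaped, a stray quote in a URL becomes `&quot;` instead of closing the attribute early.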

Comment on lines +92 to +104
const date = data.page?.date instanceof Date ? data.page.date.toISOString() : new Date().toISOString();

const article = {
  '@context': 'https://schema.org',
  '@type': articleType,
  headline: meta.title,
  description: meta.description,
  url: meta.canonicalUrl,
  mainEntityOfPage: meta.canonicalUrl,
  inLanguage: 'en',
  image: meta.ogImage,
  datePublished: date,
  dateModified: date,

⚠️ Potential issue | 🟠 Major

Avoid build-time “now” as JSON-LD publish/modify dates.

Line 92 falls back to new Date(); this makes datePublished/dateModified change on every build and can misrepresent content freshness.

Suggested fix
-  const date = data.page?.date instanceof Date ? data.page.date.toISOString() : new Date().toISOString();
+  const date = data.page?.date instanceof Date ? data.page.date.toISOString() : null;
@@
-    datePublished: date,
-    dateModified: date,
+    ...(date ? { datePublished: date, dateModified: date } : {}),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 92 - 104, The
current fallback new Date() makes the JSON-LD dates change every build; change
the date assignment so it only uses a source-controlled date (e.g.,
data.page.date when it exists and is a Date or a parsable string) and do not
default to new Date(); update the article object to conditionally include
datePublished and dateModified only when that deterministic date is available
(modify the const date and the article literal where datePublished/dateModified
are set), ensuring you neither inject a build-time "now" nor leave invalid
values in the JSON-LD.
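
A reduced sketch of the deterministic-date approach is shown below. `buildArticle` is a hypothetical, trimmed-down stand-in for the article construction inside `renderJsonLd`, and the fixed `'Article'` type replaces the `articleType` variable from the original for brevity. It also rejects invalid `Date` objects, which goes slightly beyond the suggested fix.

```javascript
// Hypothetical, trimmed-down version of the article builder discussed above.
function buildArticle(data, meta) {
  const pageDate = data.page?.date;
  // Only accept a real, valid Date; never fall back to the build time.
  const date = pageDate instanceof Date && !Number.isNaN(pageDate.getTime())
    ? pageDate.toISOString()
    : null;

  return {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: meta.title,
    url: meta.canonicalUrl,
    // The date fields are omitted entirely when no deterministic date exists.
    ...(date ? { datePublished: date, dateModified: date } : {}),
  };
}
```

Omitting the keys, rather than emitting `null`, keeps the serialized JSON-LD free of empty date properties.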

- fix broken sitemap
- add additional content schema/metadata
- generate page titles
- fix broken robots.txt

Signed-off-by: Cory Rylan <crylan@nvidia.com>
@coryrylan coryrylan force-pushed the topic-docs-sitemap-fix branch from 1c981c9 to 70a0f1c Compare April 29, 2026 16:07

Actionable comments posted: 1

♻️ Duplicate comments (6)
projects/site/public/robots.txt (1)

6-6: ⚠️ Potential issue | 🟠 Major

Hardcoded sitemap URL can diverge from configured deployment URL.

Line 6 bypasses the new ELEMENTS_SITE_URL/base-url configuration path. For non-production or alternate deployments, robots.txt can point to the wrong sitemap location. Please generate this value from the same build-time config used by sitemap generation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/public/robots.txt` at line 6, The robots.txt currently
hardcodes the sitemap URL; change it to use the same build-time site URL used by
sitemap generation (ELEMENTS_SITE_URL / base-url) so alternate deployments
match; update the robots.txt generation step to inject or template the sitemap
value from the existing config (the same source used by sitemap generation), and
include a sensible fallback if ELEMENTS_SITE_URL is missing so the generated
robots.txt always points to the correct sitemap for non-production or alternate
deployments.
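
If robots.txt were generated at build time, the Sitemap line could be derived from the same origin and base path as the sitemap itself. The sketch below assumes a hypothetical `sitemapDirective` helper; how the PR would actually template the file is not specified in this thread.

```javascript
// Hypothetical helper: joins the site origin, base path, and file name with single slashes.
function sitemapDirective(siteUrl, baseUrl = '/') {
  const origin = siteUrl.replace(/\/+$/, '');
  // Wrap the base path in slashes, then collapse any runs of slashes to one.
  const base = `/${baseUrl}/`.replace(/\/+/g, '/');
  return `Sitemap: ${origin}${base}sitemap.xml`;
}

console.log(sitemapDirective('https://nvidia.github.io', '/elements/'));
// Sitemap: https://nvidia.github.io/elements/sitemap.xml
```

With the production values this reproduces the URL that is currently hardcoded, so switching to a derived value would not change the production output.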
projects/site/src/_11ty/layouts/common.js (1)

21-25: ⚠️ Potential issue | 🟠 Major

Escape URL/image metadata before injecting into head attributes.

Lines 21, 23, and 25 still render raw attribute values. Use escapeAttr for meta.canonicalUrl and meta.ogImage to avoid malformed head markup and injection risk.

Proposed fix
-  <link rel="canonical" href="${meta.canonicalUrl}">
+  <link rel="canonical" href="${escapeAttr(meta.canonicalUrl)}">
@@
-  <meta property="og:url" content="${meta.canonicalUrl}">
+  <meta property="og:url" content="${escapeAttr(meta.canonicalUrl)}">
@@
-  <meta property="og:image" content="${meta.ogImage}">
+  <meta property="og:image" content="${escapeAttr(meta.ogImage)}">
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/common.js` around lines 21 - 25, The
canonical URL and og:image values are injected without escaping; update the
template in common.js to use escapeAttr for these attributes so head markup is
safe—replace occurrences of ${meta.canonicalUrl} with
${escapeAttr(meta.canonicalUrl)} (used in the <link rel="canonical"> href and
the <meta property="og:url"> content) and replace ${meta.ogImage} with
${escapeAttr(meta.ogImage)} for <meta property="og:image">; ensure you reuse the
existing escapeAttr helper so the values are properly escaped before rendering.
projects/site/package.json (1)

67-70: ⚠️ Potential issue | 🟡 Minor

Document the new build env var to prevent CI/local config drift.

Line 67 introduces ELEMENTS_SITE_URL for the build pipeline, but the corresponding build docs update is not included here. Please add/update /projects/internals/BUILD.md with purpose, default, and usage examples.

Based on learnings: "Review /projects/internals/BUILD.md when modifying build configuration, Wireit scripts, or CI/CD pipeline".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/package.json` around lines 67 - 70, Add documentation for the
new build environment variable ELEMENTS_SITE_URL to the BUILD.md doc: describe
its purpose (override the production site URL used during builds), state the
default value ("https://nvidia.github.io"), show example usages for CI
(export/setting in pipeline) and local development (env file and Wireit/dev
script examples), and include a note to update any CI/Wireit examples that
reference site URL so config drift is prevented; ensure the entry is placed in
the environment variables or build configuration section of BUILD.md and follows
the existing formatting/style.
projects/site/src/_11ty/layouts/metadata.js (3)

38-45: 🛠️ Refactor suggestion | 🟠 Major

Add JSDoc contracts and structured decision logging for metadata helpers.

escapeAttr, resolvePageMeta, and renderJsonLd currently lack explicit input/output constraints and branch-level observability.

As per coding guidelines, "Document agent capabilities, constraints, and expected inputs/outputs in code comments or docstrings" and "Use structured logging to track agent decision-making processes and state changes for debugging and monitoring".

Also applies to: 60-83, 89-133

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 38 - 45, Add
explicit JSDoc contracts and minimal structured decision logging for the
metadata helpers: annotate escapeAttr, resolvePageMeta, and renderJsonLd with
JSDoc describing parameter types, allowed/required fields, return types, and any
side-effects or exceptions; inside resolvePageMeta and renderJsonLd emit
structured logs (e.g., logger.debug/info with JSON payloads) at branch
points—when inputs are missing/fallbacks are used, when canonical/og/json-ld
fields are chosen, and when escaping occurs—so each decision includes the
function name and key values; ensure logs are concise and non-sensitive, and
keep JSDoc and logging calls next to the corresponding function declarations
(escapeAttr, resolvePageMeta, renderJsonLd).

1-5: ⚠️ Potential issue | 🟠 Major

Use POSIX URL joining for BASE_URL construction.

Line 1 and Line 5 use node:path.join, which is platform-dependent and can generate \ on Windows, corrupting canonical/sitemap URLs.

Suggested fix
-import { join } from 'node:path';
+import { posix as pathPosix } from 'node:path';
@@
-export const BASE_URL = join('/', process.env.PAGES_BASE_URL ?? '', '/');
+const rawBaseUrl = (process.env.PAGES_BASE_URL ?? '').replace(/\\/g, '/');
+export const BASE_URL = pathPosix.join('/', rawBaseUrl, '/');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 1 - 5, The BASE_URL
construction uses node:path's platform-dependent join which can produce
backslashes on Windows; replace the import of join from 'node:path' with the
POSIX variant (e.g., use path.posix.join or import { posix as pathPosix } and
call pathPosix.join) so BASE_URL is built with forward slashes, and update the
BASE_URL definition (the exported BASE_URL constant) to use that POSIX join to
avoid backslashes in canonical/sitemap URLs.

92-104: ⚠️ Potential issue | 🟠 Major

Do not default JSON-LD dates to build-time now.

Line 92 makes datePublished/dateModified change on every build, which misrepresents content freshness.

Suggested fix
-  const date = data.page?.date instanceof Date ? data.page.date.toISOString() : new Date().toISOString();
+  const date = data.page?.date instanceof Date ? data.page.date.toISOString() : null;
@@
-    datePublished: date,
-    dateModified: date,
+    ...(date ? { datePublished: date, dateModified: date } : {}),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/layouts/metadata.js` around lines 92 - 104, The code
currently sets date = data.page?.date instanceof Date ?
data.page.date.toISOString() : new Date().toISOString(), which causes
datePublished/dateModified to become the build-time "now"; instead, only set
date when a real page date exists and do not fall back to the build timestamp.
Change the logic around the date variable and the article object so date is
derived from data.page?.date (or left undefined/null when missing), and ensure
article.datePublished and article.dateModified are only assigned when that page
date is present (referencing the date variable, article, datePublished,
dateModified, and data.page?.date to find where to edit).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/site/src/_11ty/plugins/sitemap-xml.js`:
- Around line 21-23: The <loc> element is being populated with raw URL text
(constructed via SITE_ORIGIN, PATH_PREFIX, url into the loc variable) which can
break XML if the URL contains characters like &, <, >, ", or '. Fix by
HTML/XML-escaping the loc value before serialization: add a small helper (e.g.,
escapeXml or encodeXml) that replaces &, <, >, ", and ' with their XML entities
and call it when creating the <loc> string (use the escaped value instead of loc
in the array passed to join). Ensure the helper is reused wherever sitemap <loc>
values are produced.

---

Duplicate comments:
In `@projects/site/package.json`:
- Around line 67-70: Add documentation for the new build environment variable
ELEMENTS_SITE_URL to the BUILD.md doc: describe its purpose (override the
production site URL used during builds), state the default value
("https://nvidia.github.io"), show example usages for CI (export/setting in
pipeline) and local development (env file and Wireit/dev script examples), and
include a note to update any CI/Wireit examples that reference site URL so
config drift is prevented; ensure the entry is placed in the environment
variables or build configuration section of BUILD.md and follows the existing
formatting/style.

In `@projects/site/public/robots.txt`:
- Line 6: The robots.txt currently hardcodes the sitemap URL; change it to use
the same build-time site URL used by sitemap generation (ELEMENTS_SITE_URL /
base-url) so alternate deployments match; update the robots.txt generation step
to inject or template the sitemap value from the existing config (the same
source used by sitemap generation), and include a sensible fallback if
ELEMENTS_SITE_URL is missing so the generated robots.txt always points to the
correct sitemap for non-production or alternate deployments.

In `@projects/site/src/_11ty/layouts/common.js`:
- Around line 21-25: The canonical URL and og:image values are injected without
escaping; update the template in common.js to use escapeAttr for these
attributes so head markup is safe—replace occurrences of ${meta.canonicalUrl}
with ${escapeAttr(meta.canonicalUrl)} (used in the <link rel="canonical"> href
and the <meta property="og:url"> content) and replace ${meta.ogImage} with
${escapeAttr(meta.ogImage)} for <meta property="og:image">; ensure you reuse the
existing escapeAttr helper so the values are properly escaped before rendering.

In `@projects/site/src/_11ty/layouts/metadata.js`:
- Around line 38-45: Add explicit JSDoc contracts and minimal structured
decision logging for the metadata helpers: annotate escapeAttr, resolvePageMeta,
and renderJsonLd with JSDoc describing parameter types, allowed/required fields,
return types, and any side-effects or exceptions; inside resolvePageMeta and
renderJsonLd emit structured logs (e.g., logger.debug/info with JSON payloads)
at branch points—when inputs are missing/fallbacks are used, when
canonical/og/json-ld fields are chosen, and when escaping occurs—so each
decision includes the function name and key values; ensure logs are concise and
non-sensitive, and keep JSDoc and logging calls next to the corresponding
function declarations (escapeAttr, resolvePageMeta, renderJsonLd).
- Around line 1-5: The BASE_URL construction uses node:path's platform-dependent
join which can produce backslashes on Windows; replace the import of join from
'node:path' with the POSIX variant (e.g., use path.posix.join or import { posix
as pathPosix } and call pathPosix.join) so BASE_URL is built with forward
slashes, and update the BASE_URL definition (the exported BASE_URL constant) to
use that POSIX join to avoid backslashes in canonical/sitemap URLs.
- Around line 92-104: The code currently sets date = data.page?.date instanceof
Date ? data.page.date.toISOString() : new Date().toISOString(), which causes
datePublished/dateModified to become the build-time "now"; instead, only set
date when a real page date exists and do not fall back to the build timestamp.
Change the logic around the date variable and the article object so date is
derived from data.page?.date (or left undefined/null when missing), and ensure
article.datePublished and article.dateModified are only assigned when that page
date is present (referencing the date variable, article, datePublished,
dateModified, and data.page?.date to find where to edit).
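
For the `robots.txt` item in the list above, one possible shape is to render the file from the same site URL that the sitemap uses. This is only a sketch: `renderRobotsTxt` is a hypothetical helper, the crawler rules shown are placeholders, not the actual contents of `public/robots.txt`, and the sitemap is assumed to sit at `sitemap.xml` directly under the base URL passed in.

```javascript
// Default taken from the PR summary; used when ELEMENTS_SITE_URL is unset or empty.
const DEFAULT_SITE_URL = 'https://nvidia.github.io';

// Hypothetical helper: builds robots.txt text whose Sitemap line follows
// the configured site URL instead of a hardcoded production URL.
function renderRobotsTxt(siteUrl) {
  // Fall back to the default, then strip trailing slashes to avoid "//sitemap.xml".
  const base = (siteUrl || DEFAULT_SITE_URL).replace(/\/+$/, '');
  return [
    'User-agent: *', // placeholder rule, not the real policy
    'Allow: /',
    '',
    `Sitemap: ${base}/sitemap.xml`,
    '',
  ].join('\n');
}

console.log(renderRobotsTxt(process.env.ELEMENTS_SITE_URL));
```

If the deployed site lives under a path prefix, the caller would pass the prefixed base URL so the `Sitemap` line stays an absolute URL pointing at the generated file.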
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 642ad5e9-abf2-4683-a259-37ff5009ef4c

📥 Commits

Reviewing files that changed from the base of the PR and between 1c981c9 and 70a0f1c.

📒 Files selected for processing (10)
  • projects/site/eleventy.config.js
  • projects/site/package.json
  • projects/site/public/robots.txt
  • projects/site/src/_11ty/layouts/common.js
  • projects/site/src/_11ty/layouts/metadata.js
  • projects/site/src/_11ty/plugins/sitemap-xml.js
  • projects/site/src/_11ty/utils/env.js
  • projects/site/src/docs/elements/_tabs/examples.11ty.js
  • projects/site/src/index.md
  • projects/site/src/robots.txt
💤 Files with no reviewable changes (1)
  • projects/site/src/robots.txt

Comment on lines +21 to +23
const loc = `${SITE_ORIGIN}${PATH_PREFIX}${url}`;
return ['<url>', `<loc>${loc}</loc>`, `<lastmod>${lastmod}</lastmod>`, '</url>'].join('\n');
});

⚠️ Potential issue | 🟡 Minor

Escape <loc> values before XML serialization.

Line 22 writes raw URL text into XML. If a URL contains XML-sensitive characters, sitemap output becomes invalid.

Proposed hardening
+const escapeXml = value =>
+  value
+    .replaceAll('&', '&amp;')
+    .replaceAll('<', '&lt;')
+    .replaceAll('>', '&gt;')
+    .replaceAll('"', '&quot;')
+    .replaceAll("'", '&apos;');
+
 export function sitemapPlugin(eleventyConfig) {
   eleventyConfig.on('eleventy.after', async ({ results } = {}) => {
@@
     const entries = urls.map(url => {
       const loc = `${SITE_ORIGIN}${PATH_PREFIX}${url}`;
-      return ['<url>', `<loc>${loc}</loc>`, `<lastmod>${lastmod}</lastmod>`, '</url>'].join('\n');
+      return ['<url>', `<loc>${escapeXml(loc)}</loc>`, `<lastmod>${lastmod}</lastmod>`, '</url>'].join('\n');
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-      const loc = `${SITE_ORIGIN}${PATH_PREFIX}${url}`;
-      return ['<url>', `<loc>${loc}</loc>`, `<lastmod>${lastmod}</lastmod>`, '</url>'].join('\n');
-    });
+      const loc = `${SITE_ORIGIN}${PATH_PREFIX}${url}`;
+      return ['<url>', `<loc>${escapeXml(loc)}</loc>`, `<lastmod>${lastmod}</lastmod>`, '</url>'].join('\n');
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/site/src/_11ty/plugins/sitemap-xml.js` around lines 21 - 23, The
<loc> element is being populated with raw URL text (constructed via SITE_ORIGIN,
PATH_PREFIX, url into the loc variable) which can break XML if the URL contains
characters like &, <, >, ", or '. Fix by HTML/XML-escaping the loc value before
serialization: add a small helper (e.g., escapeXml or encodeXml) that replaces
&, <, >, ", and ' with their XML entities and call it when creating the <loc>
string (use the escaped value instead of loc in the array passed to join).
Ensure the helper is reused wherever sitemap <loc> values are produced.
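
To see what the escaping changes in practice, the sketch below runs the proposed `escapeXml` helper on a URL that carries a query string. The origin, path prefix, and URL are made-up values for illustration; the real constants live in the plugin.

```javascript
// Same helper as in the proposed hardening above.
const escapeXml = value =>
  value
    .replaceAll('&', '&amp;') // must run first, see note below
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .replaceAll('"', '&quot;')
    .replaceAll("'", '&apos;');

// Hypothetical values, chosen only to produce an XML-sensitive character.
const SITE_ORIGIN = 'https://example.com';
const PATH_PREFIX = '/docs';
const url = '/search/?q=a&lang=en';

const loc = `${SITE_ORIGIN}${PATH_PREFIX}${url}`;
const entry = `<loc>${escapeXml(loc)}</loc>`;

console.log(entry); // <loc>https://example.com/docs/search/?q=a&amp;lang=en</loc>
```

The ampersand replacement has to come first; if it ran last, the `&` introduced by the other entities would itself be escaped again.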

@coryrylan coryrylan merged commit c43ad8f into main Apr 29, 2026
8 checks passed
@coryrylan coryrylan deleted the topic-docs-sitemap-fix branch April 29, 2026 16:21
@coryrylan
Collaborator Author

🎉 This issue has been resolved in version 0.0.11 🎉

Changelog

@coryrylan
Collaborator Author

🎉 This issue has been resolved in version 0.1.2 🎉

Changelog

@coryrylan
Collaborator Author

🎉 This issue has been resolved in version 0.0.8 🎉

Changelog

@coryrylan
Collaborator Author

🎉 This issue has been resolved in version 0.0.10 🎉

Changelog
